Inline filtering for vector sets #1890
Open
metajack wants to merge 7 commits into
Pull request overview
This PR updates Garnet’s vector set similarity search (VSIM) to support inline filtering via a DiskANN → C# callback (instead of relying on over-retrieval + post-filtering), and updates the public docs/tests around filtering and FILTER-EF.
Changes:
- Adds an unmanaged inline-filter callback wiring from DiskANN (Rust) into Garnet (C#) and threads per-query compiled filter state via `[ThreadStatic]`.
- Changes `FILTER-EF` semantics/limits (default `16`, range `[4, 256]`) and updates validation + tests accordingly.
- Introduces a documented binary attribute encoding/extraction path intended to accelerate filter evaluation, and adds a new design doc describing the end-to-end approach.
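
To illustrate the `[ThreadStatic]` hand-off described in the first bullet, here is a minimal sketch. It assumes the search runs synchronously on the calling thread, so the per-candidate callback sees the state installed for that query. The type `InlineFilterStateSketch` and its members are hypothetical, and a managed delegate stands in for the unmanaged callback the PR actually wires up.

```csharp
using System;

public static class InlineFilterStateSketch
{
    // Each thread gets its own copy, so concurrent queries on different threads do not interfere.
    [ThreadStatic]
    private static Func<uint, bool> activeFilter;

    // Stand-in for the per-candidate callback: 1 = keep the candidate, 0 = exclude it.
    // With no filter installed on this thread, the sketch excludes the candidate.
    public static byte EvaluateCandidate(uint internalId)
    {
        var filter = activeFilter;
        return filter != null && filter(internalId) ? (byte)1 : (byte)0;
    }

    // Installs the filter for the duration of one search and always clears it afterwards,
    // so stale state cannot leak into the next query on the same thread.
    public static T RunWithFilter<T>(Func<uint, bool> filter, Func<T> search)
    {
        activeFilter = filter;
        try
        {
            return search();
        }
        finally
        {
            activeFilter = null;
        }
    }
}
```

Clearing the state in `finally` matters because thread-static fields outlive the query; an exception during search would otherwise leave the previous filter installed.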
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| website/docs/dev/filtered-search-design.md | New end-to-end design doc for filtered vector search and inline filtering. |
| website/docs/commands/vector-sets.md | Updates VSIM option docs for FILTER-EF and inline filtering behavior. |
| test/standalone/Garnet.test.vectorset/RespVectorSetTests.cs | Adds/updates VSIM filter validation tests and new “bad filter” cases. |
| test/standalone/Garnet.test.extensions/DiskANN/DiskANNServiceTests.cs | Updates DiskANN index creation tests for the new callback parameter. |
| libs/server/Resp/Vector/VectorManager.Migration.cs | Passes the new inline filter callback when recreating indexes. |
| libs/server/Resp/Vector/VectorManager.Locking.cs | Passes the new inline filter callback when creating/recreating indexes. |
| libs/server/Resp/Vector/VectorManager.Filter.cs | Adds thread-static inline filter state + candidate evaluation logic. |
| libs/server/Resp/Vector/VectorManager.cs | Switches VSIM paths toward inline filtering setup and bitmap sizing helper. |
| libs/server/Resp/Vector/VectorManager.Callbacks.cs | Adds the unmanaged callback entrypoint to call into filter evaluation. |
| libs/server/Resp/Vector/VectorFilterExpression.cs | Simplifies ExprProgram by removing redundant length fields. |
| libs/server/Resp/Vector/RespServerSessionVectors.cs | Updates FILTER-EF parsing/validation defaults and bounds. |
| libs/server/Resp/Vector/ExprRunner.cs | Iterates using program.Instructions.Length instead of removed program.Length. |
| libs/server/Resp/Vector/DiskANNService.cs | Extends create/recreate index P/Invoke signature to include filter callback. |
| libs/server/Resp/Vector/AttributeExtractor.cs | Adds binary attribute conversion/extraction APIs and minor JSON parsing cleanup. |
| Directory.Packages.props | Bumps diskann-garnet package version. |
Comment on lines 780 to 786

    // Apply post-filtering if filter is specified
    if (!filter.IsEmpty)
    {
        // Ensure bitmap is large enough for the over-retrieved result set
        var requiredBitmapBytes = (found + 7) >> 3;
        if (requiredBitmapBytes > filterBitmap.Length)
        {
            if (!filterBitmap.IsSpanByte)
            {
                filterBitmap.Memory.Dispose();
            }

            filterBitmap = new SpanByteAndMemory(MemoryPool<byte>.Shared.Rent(requiredBitmapBytes), requiredBitmapBytes);
        }
        EnsureFilterBitmapSize(ref filterBitmap, found);

        _ = ApplyPostFilter(filter, found, outputAttributes.ReadOnlySpan, filterBitmap.Span, ActiveThreadSession.scratchBufferBuilder);
    }
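
For reference, the sizing rule in this hunk is one bit per candidate, rounded up to whole bytes. The sketch below shows that arithmetic and a grow-only resize. It is not the PR's `EnsureFilterBitmapSize`: `FilterBitmapSketch` is a hypothetical helper that uses a plain pooled `byte[]` instead of `SpanByteAndMemory`.

```csharp
using System;
using System.Buffers;

public static class FilterBitmapSketch
{
    // One bit per candidate, rounded up to whole bytes: (count + 7) >> 3.
    public static int RequiredBytes(int candidateCount) => (candidateCount + 7) >> 3;

    // Grows the pooled buffer only when the current one is too small.
    // The old buffer goes back to the pool before a larger one is rented.
    public static void EnsureSize(ref byte[] bitmap, int candidateCount)
    {
        var required = RequiredBytes(candidateCount);
        if (bitmap != null && bitmap.Length >= required)
            return;

        if (bitmap != null)
            ArrayPool<byte>.Shared.Return(bitmap);

        bitmap = ArrayPool<byte>.Shared.Rent(required);

        // Rented arrays are not zeroed, so clear the part that will be used.
        bitmap.AsSpan(0, required).Clear();
    }

    public static void SetBit(Span<byte> bitmap, int index) =>
        bitmap[index >> 3] |= (byte)(1 << (index & 7));

    public static bool GetBit(ReadOnlySpan<byte> bitmap, int index) =>
        (bitmap[index >> 3] & (1 << (index & 7))) != 0;
}
```

Note that `ArrayPool<byte>.Shared.Rent` may return a larger array than requested, so callers should track the required length separately rather than rely on `bitmap.Length`.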
Comment on lines +861 to +868

    var instrCount = ExprCompiler.TryCompile(filter, instrBuf, tuplePoolBuf, tokensBuf, opsStackBuf, out var tupleCount, out _);
    if (instrCount < 0)
    {
        outputDistances.Length = 0;
        filterBitmap.Length = 0;
        outputIdFormat = VectorIdFormat.I32LengthPrefixed;
        return VectorManagerResult.OK;
    }
Comment on lines +276 to +306

    // 1. Read external ID for this internal_id via ExtMap
    Span<byte> iidKey = stackalloc byte[sizeof(uint)];
    BinaryPrimitives.WriteUInt32LittleEndian(iidKey, internalId);

    Span<byte> eidBuf = stackalloc byte[128];
    var eidMem = SpanByteAndMemory.FromPinnedSpan(eidBuf);
    try
    {
        if (!ReadSizeUnknown(context | DiskANNService.ExternalIdMap, true, iidKey, ref eidMem))
            return 0; // can't find external ID → exclude

        // 2. Read attributes by external ID
        Span<byte> attrBuf = stackalloc byte[256];
        var attrMem = SpanByteAndMemory.FromPinnedSpan(attrBuf);
        try
        {
            if (!ReadSizeUnknown(context | DiskANNService.Attributes, true, eidMem.ReadOnlySpan, ref attrMem))
                return 0; // no attributes → exclude

            // 3. Rebuild ExprProgram from thread-static state pointers
            var program = new ExprProgram
            {
                Instructions = state.InstrBuf,
                TuplePool = state.TuplePoolBuf,
                RuntimePool = state.RuntimePoolBuf,
                RuntimePoolLength = 0,
            };

            program.ResetRuntimePool();

            AttributeExtractor.ExtractFields(attrMem.ReadOnlySpan, state.FilterBytes, state.SelectorRanges, state.ExtractedFields, ref program);
Comment on lines +565 to +569

    output[pos] = 8;
    output[pos + 1] = 0;
    pos += 2;
    _ = BitConverter.TryWriteBytes(output[pos..], numVal);
    pos += 8;
Comment on lines +684 to +688

    if (valueLen == 8)
    {
        var numVal = System.BitConverter.ToDouble(binary[pos..]);
        results[matchIndex] = ExprToken.NewNum(numVal);
    }
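
The write and read hunks above use `BitConverter`, whose byte order follows the host. If the encoding is meant to be a fixed on-disk format, an explicit little-endian round trip is one option. The sketch below assumes the two bytes `8, 0` are a little-endian 16-bit length prefix followed by the 8-byte double; that reading is inferred from the hunks, not confirmed by the PR. `NumericFieldSketch` is a hypothetical name.

```csharp
using System;
using System.Buffers.Binary;

public static class NumericFieldSketch
{
    // 2-byte length prefix + 8-byte IEEE 754 double.
    public const int EncodedSize = sizeof(ushort) + sizeof(double);

    // Writes the length prefix and the value in little-endian order, regardless of host byte order.
    public static bool TryWriteNumber(Span<byte> output, double value, out int written)
    {
        written = 0;
        if (output.Length < EncodedSize)
            return false;

        BinaryPrimitives.WriteUInt16LittleEndian(output, sizeof(double));
        BinaryPrimitives.WriteDoubleLittleEndian(output[sizeof(ushort)..], value);
        written = EncodedSize;
        return true;
    }

    // Validates the length prefix and the remaining buffer size before reading the value.
    public static bool TryReadNumber(ReadOnlySpan<byte> input, out double value)
    {
        value = 0;
        if (input.Length < EncodedSize)
            return false;
        if (BinaryPrimitives.ReadUInt16LittleEndian(input) != sizeof(double))
            return false;

        value = BinaryPrimitives.ReadDoubleLittleEndian(input[sizeof(ushort)..]);
        return true;
    }
}
```

The bounds checks also avoid discarding the result of a failed write, which the `_ = BitConverter.TryWriteBytes(...)` pattern would silently ignore.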

    This would reduce the per-candidate cost to **zero extra store reads** — the only remaining overhead is the binary field scan and expression evaluation.

    ### Further with attibute index: Pre-built attribute index to replace per-candidate filter evaluation
Comment on lines +34 to +43

    ### Solution: Add a second attribute store optimized for query-time filter evaluation

    The current change **adds a new attribute store** alongside the existing one. The two stores serve different purposes:

    | Store | Keyed by | Format | Purpose |
    |-------|----------|--------|---------|
    | Existing | External ID (user key) | Raw JSON | RESP command operations (`VGETATTR`, `VSETATTR`, etc.) |
    | **New** | Internal ID (DiskANN ID) | Binary | Inline filter evaluation at query time |

    The existing external ID keyed JSON store is untouched — it continues to serve all RESP command operations. The new internal ID keyed binary store is a **write-time derived projection** of the same data, optimized purely for the inline filter callback's access pattern.
Comment on lines 946 to 952

    // Apply post-filtering if filter is specified
    if (!filter.IsEmpty)
    {
        // Ensure bitmap is large enough for the over-retrieved result set
        var requiredBitmapBytes = (found + 7) >> 3;
        if (requiredBitmapBytes > filterBitmap.Length)
        {
            if (!filterBitmap.IsSpanByte)
            {
                filterBitmap.Memory.Dispose();
            }

            filterBitmap = new SpanByteAndMemory(MemoryPool<byte>.Shared.Rent(requiredBitmapBytes), requiredBitmapBytes);
        }
        EnsureFilterBitmapSize(ref filterBitmap, found);

        _ = ApplyPostFilter(filter, found, outputAttributes.ReadOnlySpan, filterBitmap.Span, ActiveThreadSession.scratchBufferBuilder);
    }
Comment on lines 384 to +385

    | `FILTER expr` | _none_ | Post-filter results by an attribute expression (see [Filter Expressions](#filter-expressions)). |
    | `FILTER-EF n` | `min(COUNT * 200, 100000000)` | Maximum number of nearest neighbors to **inspect** before filtering. Must be in `[0, 100000000]`. |
    | `FILTER-EF n` | `16` | Scale factor for adaptive inline filter search. Must be in `[4, 256]`. This controls how high the EF will scale based on selectivity. |
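
The new bounds in the last row can be checked with a small parser. This is a sketch of the documented rule only (default `16`, valid range `[4, 256]`), not the validation code in `RespServerSessionVectors.cs`; `FilterEfSketch` and its members are hypothetical names.

```csharp
using System;
using System.Globalization;

public static class FilterEfSketch
{
    public const int Default = 16; // used when FILTER-EF is omitted
    public const int Min = 4;
    public const int Max = 256;

    // Returns true only when the argument is a plain decimal integer in [Min, Max].
    // NumberStyles.None rejects signs, whitespace, and separators.
    public static bool TryParse(string arg, out int filterEf)
    {
        return int.TryParse(arg, NumberStyles.None, CultureInfo.InvariantCulture, out filterEf)
            && filterEf >= Min
            && filterEf <= Max;
    }
}
```

Values that were valid under the old `[0, 100000000]` range, such as `0` or `1000`, are rejected under the new one, so existing callers may need to adjust.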
Based on Haiyang's original two-queue work, this PR is rebased onto the quantization branch and modified for the new inline filtering with adaptive L.